2025-03-21

Imagine you have a cat (or a dog)… and they broke a frame

Priors

An opinion before data collection

Priors

We all have opinions, even before data collection

Prior distributions

Expressing uncertainty in prior guesses

  • How do you distinguish between no information and uncertain information?

  • How do you express your uncertainty in your initial guess?

  • Express prior belief as a distribution rather than a single value.

  • Call it your prior for short
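A minimal sketch of why a distribution says more than a single guess. The two priors below are hypothetical (a flat one and a peaked, Beta-shaped one), and the grid approximation is just one convenient way to write them down. Both have the same best guess of 0.5, but very different spread.

```python
# Two hypothetical priors for theta = P(guilty), both centred on 0.5.
# Each is Beta-shaped: density proportional to theta^(a-1) * (1-theta)^(b-1).

def grid_prior(a, b, n_points=1001):
    """Return (grid, probs): a discretised, normalised Beta(a, b)-shaped prior."""
    grid = [i / (n_points - 1) for i in range(n_points)]
    weights = [t ** (a - 1) * (1 - t) ** (b - 1) for t in grid]
    total = sum(weights)
    return grid, [w / total for w in weights]

def mean_and_sd(grid, probs):
    """Mean and standard deviation of a discrete distribution on the grid."""
    mean = sum(t * p for t, p in zip(grid, probs))
    var = sum((t - mean) ** 2 * p for t, p in zip(grid, probs))
    return mean, var ** 0.5

# Flat prior: every value of theta equally believable ("no information").
flat_mean, flat_sd = mean_and_sd(*grid_prior(1, 1))
# Peaked prior: a guess of about 50/50, held with some confidence.
peaked_mean, peaked_sd = mean_and_sd(*grid_prior(20, 20))

print(f"flat:   mean={flat_mean:.2f}, sd={flat_sd:.2f}")
print(f"peaked: mean={peaked_mean:.2f}, sd={peaked_sd:.2f}")
```

A single number (0.5) cannot tell these two belief states apart; the spread of the distribution can.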

Prior distributions

Representing priors graphically

  • Prior distributions are your entire belief state
  • Write it as \(\theta = P(\text{guilty})\): the probability that the cat broke the frame, not just a yes/no verdict

Are there priors in Frequentist statistics?

Bayesian statistics (generally) starts with analyst-informed priors

Frequentist statistics:

  • Hypothesis testing \(H_0\), \(H_a\), and p-values
  • Ignore priors / always start with no prior information, i.e. the “flat prior”
  • Interested in likelihood of a parameter given the data: \(P(\text{data} | \theta) = \mathcal L(\theta | \text{data})\)
  • Not interested in the most probable value of \(\theta\): \(\theta\) is treated as fixed, so it gets no probability distribution
  • \(\mathcal L(\theta | \text{data})\) is a function, not a probability distribution
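A small numerical check of the last point, assuming a binomial model and made-up data (7 "successes" in 10 trials; neither is from the slides). For a fixed \(\theta\), the probabilities of all possible data sum to 1. For fixed data, the likelihood over all \(\theta\) does not integrate to 1.

```python
from math import comb

def binom_likelihood(theta, k, n):
    """P(k successes in n trials | theta) under a binomial model."""
    return comb(n, k) * theta ** k * (1 - theta) ** (n - k)

k, n = 7, 10  # hypothetical data

# Fix theta, sum over every possible data outcome: a probability distribution.
total_over_data = sum(binom_likelihood(0.3, j, n) for j in range(n + 1))

# Fix the data, integrate over theta (midpoint rule): NOT a probability distribution.
n_steps = 10_000
width = 1 / n_steps
area_over_theta = sum(
    binom_likelihood((i + 0.5) * width, k, n) for i in range(n_steps)
) * width

print(f"sum over data:       {total_over_data:.4f}")
print(f"integral over theta: {area_over_theta:.4f}")
```

For this model the integral over \(\theta\) works out to \(1/(n+1)\), which is why the likelihood has to be rescaled before it can be read as a distribution over \(\theta\).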

Likelihood

The information in the data

Likelihood

Learning from data

  • Inspect the frame and the area around it:
  • Summarize all new information in a new probability distribution: \(P(D|\theta)\)
  • The likelihood:
    • Contains all information you gathered on your data, \(D\)
    • Gives \(P(D|\theta = \theta_i)\) for all possible values of \(\theta\)
  • \(p\)-values are the probability, assuming \(H_0\) is true, of observing \(D\) or more extreme data
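To make the \(p\)-value bullet concrete, here is a sketch with an assumed binomial model, hypothetical data (7 of 10), and a hypothetical null value \(\theta_0 = 0.5\). "More extreme" is taken to mean "at least as many successes", i.e. a one-sided test; other definitions are possible.

```python
from math import comb

def binom_pmf(k, n, theta):
    """P(k successes in n trials | theta)."""
    return comb(n, k) * theta ** k * (1 - theta) ** (n - k)

def one_sided_p_value(k_obs, n, theta0):
    """P(K >= k_obs | theta = theta0): the observed data or anything more extreme."""
    return sum(binom_pmf(k, n, theta0) for k in range(k_obs, n + 1))

# Hypothetical data and null hypothesis H0: theta = 0.5
p_value = one_sided_p_value(7, 10, 0.5)
print(f"p = {p_value:.3f}")  # prints "p = 0.172"
```

Note that this is a statement about the data given one fixed \(\theta\), not about \(\theta\) itself.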

Likelihood

Summarizing your data

  • The nail gave out due to excessive weight
  • Other things are knocked over

Likelihood

What if it was just chance?

  • Maybe the frame just fell and scared the cat?
  • More probable if the cat is generally innocent.

Posterior

Your updated belief

Did the cat knock the frame off the wall?

  • Ultimately, we want the posterior probability of the cat being guilty, \(P(\theta | D)\)
  • Combine prior and likelihood using Bayes’ theorem:

\[\text{Posterior} = \frac{\text{Prior} \times \text{Likelihood}}{\text{constant}}\]
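Written out in the notation used so far, the constant is the probability of the data averaged over every value of \(\theta\) the prior allows. It does not depend on \(\theta\); it only rescales the result so that the posterior integrates to 1:

\[P(\theta | D) = \frac{P(\theta) \, P(D | \theta)}{P(D)}, \qquad P(D) = \int P(\theta') \, P(D | \theta') \, d\theta'\]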

CC BY 4.0: Maxwell B. Joseph

Updating our belief

\[\text{Posterior} = \frac{\text{Prior} \times \text{Likelihood}}{\text{constant}}\]
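One way to carry out this update numerically is a grid approximation. The sketch below assumes a flat prior, a binomial likelihood, and made-up data (7 of 10), none of which come from the slides. It multiplies prior by likelihood at each grid value and divides by the sum.

```python
from math import comb

def grid_posterior(prior, likelihood):
    """Posterior = prior * likelihood / constant, evaluated point by point."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    constant = sum(unnormalised)  # makes the posterior sum to 1
    return [u / constant for u in unnormalised]

n_points = 1001
grid = [i / (n_points - 1) for i in range(n_points)]  # candidate values of theta

prior = [1 / n_points] * n_points  # hypothetical flat prior
k, n = 7, 10                       # hypothetical data
likelihood = [comb(n, k) * t ** k * (1 - t) ** (n - k) for t in grid]

posterior = grid_posterior(prior, likelihood)
posterior_mean = sum(t * p for t, p in zip(grid, posterior))
print(f"posterior mean of theta: {posterior_mean:.3f}")
```

Swapping in a different `prior` list is all it takes to see how the same data updates a different starting belief.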

How much does our prior affect our posterior?

\[\text{Posterior} = \frac{\text{Prior} \times \text{Likelihood}}{\text{constant}}\]
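A quick way to explore this question is the conjugate Beta-binomial update: with a Beta(a, b) prior and k successes in n trials, the posterior is Beta(a + k, b + n - k). The priors and data sets below are hypothetical. The sketch compares how far apart two priors leave the posterior mean with little data versus a lot.

```python
def posterior_mean(a, b, k, n):
    """Mean of Beta(a + k, b + n - k), the posterior for a Beta(a, b) prior."""
    return (a + k) / (a + b + n)

# Hypothetical priors: (a, b) parameters of a Beta distribution
priors = {"flat": (1, 1), "usually innocent": (2, 8)}
# Hypothetical data sets: (successes, trials), same proportion, different size
datasets = {"small": (7, 10), "large": (70, 100)}

gaps = {}
for data_name, (k, n) in datasets.items():
    means = {name: posterior_mean(a, b, k, n) for name, (a, b) in priors.items()}
    gaps[data_name] = abs(means["flat"] - means["usually innocent"])
    print(data_name, {name: round(m, 3) for name, m in means.items()})

print("gap between priors:", {name: round(g, 3) for name, g in gaps.items()})
```

With ten observations the two priors disagree noticeably; with a hundred, the data dominates and the posteriors nearly coincide.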

Expressing uncertainty

Show distributions, not point estimates

Expressing uncertainty with prior predictions

  • Prior, likelihood, and posterior are probability distributions
  • We can use them to express our uncertainty using intuitive measures of probability

 

Expressing uncertainty before data collection

Prior predictive intervals: a range of credible (i.e., believable) values that holds a stated share of the prior probability

E.g., a 50% prior predictive interval using the 25th and 75th percentiles:

Expressing uncertainty before data collection

Prior predictive intervals: a range of credible (i.e., believable) values that holds a stated share of the prior probability

E.g., a different 50% prior predictive interval, using the 10th and 60th percentiles:
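Both choices of percentiles enclose half of the prior probability, so a "50% interval" is not unique. A minimal sketch, assuming a hypothetical Beta(2, 5)-shaped prior on a grid (the prior and the helper names are illustrative, not from the slides):

```python
def grid_quantile(grid, probs, q):
    """Smallest grid value whose cumulative probability reaches q."""
    cumulative = 0.0
    for t, p in zip(grid, probs):
        cumulative += p
        if cumulative >= q:
            return t
    return grid[-1]

def interval(grid, probs, lower_q, upper_q):
    """Interval between two percentiles of a grid distribution."""
    return grid_quantile(grid, probs, lower_q), grid_quantile(grid, probs, upper_q)

def mass_between(grid, probs, low, high):
    """Probability that falls in (low, high]."""
    return sum(p for t, p in zip(grid, probs) if low < t <= high)

n_points = 1001
grid = [i / (n_points - 1) for i in range(n_points)]
weights = [t ** (2 - 1) * (1 - t) ** (5 - 1) for t in grid]  # Beta(2, 5) shape
probs = [w / sum(weights) for w in weights]

central = interval(grid, probs, 0.25, 0.75)  # 25th to 75th percentile
shifted = interval(grid, probs, 0.10, 0.60)  # 10th to 60th percentile

for name, (low, high) in {"25-75": central, "10-60": shifted}.items():
    print(f"{name}: [{low:.2f}, {high:.2f}] holds {mass_between(grid, probs, low, high):.2f}")
```

The same helpers work unchanged on a posterior grid, which gives posterior credible intervals.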

Expressing uncertainty in the likelihood

Compatibility intervals: compatibility of the data with specific values of \(\theta\)

NOTE: this interval indicates a probability of the data, not of \(\theta\)!

E.g., a 50% compatibility interval using the 25th and 75th percentiles:

Expressing uncertainty in your updated knowledge

Posterior credible intervals: range of credible (i.e., believable) values of \(\theta\) with some degree of credibility

E.g., a 50% posterior credible interval using the 25th and 75th percentiles:

Summarizing the posterior distribution

\(\bullet\) posterior mean

\(\textbf{-}\) posterior median

\(\color{forestgreen}{\text{green lines}}\) 50%, 70%, 90%, and 99% credible intervals
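The same summaries can be computed from posterior draws. This sketch assumes a hypothetical Beta(8, 4) posterior (what a flat prior and 7 successes in 10 trials would give) and uses equal-tailed intervals. The helper names are illustrative.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

a, b = 8, 4  # hypothetical posterior: Beta(8, 4)
draws = sorted(random.betavariate(a, b) for _ in range(20_000))

def sample_quantile(sorted_draws, q):
    """Approximate quantile: the draw at position q along the sorted sample."""
    return sorted_draws[int(q * (len(sorted_draws) - 1))]

def central_interval(sorted_draws, level):
    """Equal-tailed credible interval holding `level` of the draws."""
    tail = (1 - level) / 2
    return sample_quantile(sorted_draws, tail), sample_quantile(sorted_draws, 1 - tail)

post_mean = statistics.mean(draws)
post_median = statistics.median(draws)
intervals = {level: central_interval(draws, level) for level in (0.50, 0.70, 0.90, 0.99)}

print(f"mean={post_mean:.3f}, median={post_median:.3f}")
for level, (low, high) in intervals.items():
    print(f"{level:.0%} credible interval: [{low:.3f}, {high:.3f}]")
```

Each wider interval contains the narrower ones, which is what the nested green lines show.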